{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "K45xhdPRsZJV"
   },
   "source": [
    "# Movie Recommendation System using Doc2vec Embeddings and Vector DB"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XUj6NXD0sdgf"
   },
   "source": [
    "This Colab notebook aims to illustrate the process of creating a recommendation system using embeddings and a Vector DB.\n",
    "\n",
    "This approach involves combining the various movie genres or characteristics of a movie to form Doc2Vec embeddings, which offer a comprehensive portrayal of the movie content.\n",
    "\n",
    "These embeddings serve dual purposes: they can either be directly inputted into a classification model for genre classification or stored in a VectorDB. By storing embeddings in a VectorDB, efficient retrieval and query search for recommendations become possible at a later stage.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qEa74a_Wtpc7"
   },
   "source": [
    "### Installing the relevant dependencies\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "hyde90IntuFi"
   },
   "outputs": [],
   "source": [
    "!pip install torch scikit-learn lancedb nltk gensim lancedb scipy==1.12 kaggle"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "shPjHTZbtxTh"
   },
   "source": [
    "## Kaggle Configuration and Data Needs\n",
    "\n",
    "We are using a movies metadata data which is being uploaded on the Kaggle. To download the dataset and use it for our recommendation system, we will need a `kaggle.json` file containing our creds.\n",
    "\n",
    "You can download the `kaggle.json` file from your Kaggle account settings. Follow these steps and make your life easy.\n",
    "\n",
    "1. Go to Kaggle and log in to your account.\n",
    "2. Navigate to Your Account Settings and click on your profile picture in the top right corner of the page, Now From the dropdown menu, select `Account`.\n",
    "3. Scroll down to the `API` section, Click on `Create New API Token`. This will download a file named kaggle.json to your computer.\n",
    "\n",
    "Once you have the `kaggle.json` file, you need to upload it here on colab data space. After uploading the `kaggle.json` file, run the following code to set up the credentials and download the dataset in `data` directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "6Tl2qzgKsWtF"
   },
   "outputs": [],
   "source": [
    "import json\n",
    "import os\n",
    "\n",
    "# Assuming kaggle.json is uploaded to the current directory\n",
    "with open(\"kaggle.json\") as f:\n",
    "    kaggle_credentials = json.load(f)\n",
    "\n",
    "os.environ[\"KAGGLE_USERNAME\"] = kaggle_credentials[\"username\"]\n",
    "os.environ[\"KAGGLE_KEY\"] = kaggle_credentials[\"key\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "8va-0of3sU0x"
   },
   "outputs": [],
   "source": [
    "from kaggle.api.kaggle_api_extended import KaggleApi\n",
    "\n",
    "# Initialize the Kaggle API\n",
    "api = KaggleApi()\n",
    "api.authenticate()\n",
    "\n",
    "# Specify the dataset you want to download\n",
    "dataset = \"rounakbanik/the-movies-dataset\"\n",
    "destination = \"data/\"\n",
    "\n",
    "# Create the destination directory if it doesn't exist\n",
    "if not os.path.exists(destination):\n",
    "    os.makedirs(destination)\n",
    "\n",
    "# Download the dataset\n",
    "api.dataset_download_files(dataset, path=destination, unzip=True)\n",
    "\n",
    "print(f\"Dataset {dataset} downloaded to {destination}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "hBYzad3lrY4e",
    "outputId": "5a8f7983-80be-47e0-aa9c-ae4e10495c1e"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 1000/1000 [00:00<00:00, 5050.83it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5161.29it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5006.18it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5222.83it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5216.24it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5171.35it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5109.78it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5222.42it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5133.39it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5024.74it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5117.18it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 4963.78it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5405.55it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5369.51it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5349.33it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5374.53it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5194.32it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5296.75it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5204.32it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5309.43it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5333.12it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5289.35it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5317.42it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5322.46it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5378.43it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5488.32it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5546.43it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 2502.38it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5369.91it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 4354.99it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5193.60it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5536.27it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 3476.56it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 4819.07it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 4500.37it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5184.11it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5098.14it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5523.73it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 4655.12it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5113.63it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5336.63it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5564.83it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5310.91it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 5533.46it/s]\n",
      "100%|██████████| 1000/1000 [00:00<00:00, 4255.41it/s]\n",
      "100%|██████████| 466/466 [00:00<00:00, 5617.03it/s]\n",
      "Building Vocabulary: 100%|██████████| 44506/44506 [00:00<00:00, 104121.48it/s]\n",
      "Epoch 1: 100%|██████████| 44506/44506 [00:02<00:00, 20444.80it/s]\n",
      "Epoch 2: 100%|██████████| 44506/44506 [00:02<00:00, 20700.43it/s]\n",
      "Epoch 3: 100%|██████████| 44506/44506 [00:02<00:00, 20831.06it/s]\n",
      "Epoch 4: 100%|██████████| 44506/44506 [00:02<00:00, 20885.78it/s]\n",
      "Epoch 5: 100%|██████████| 44506/44506 [00:02<00:00, 19616.38it/s]\n",
      "Epoch 6: 100%|██████████| 44506/44506 [00:02<00:00, 19634.24it/s]\n",
      "Epoch 7: 100%|██████████| 44506/44506 [00:02<00:00, 20579.08it/s]\n",
      "Epoch 8: 100%|██████████| 44506/44506 [00:02<00:00, 20727.00it/s]\n",
      "Epoch 9: 100%|██████████| 44506/44506 [00:02<00:00, 21242.19it/s]\n",
      "Epoch 10: 100%|██████████| 44506/44506 [00:02<00:00, 18476.39it/s]\n",
      "Epoch 11: 100%|██████████| 44506/44506 [00:02<00:00, 21169.07it/s]\n",
      "Epoch 12: 100%|██████████| 44506/44506 [00:02<00:00, 20967.64it/s]\n",
      "Epoch 13: 100%|██████████| 44506/44506 [00:02<00:00, 20192.34it/s]\n",
      "Epoch 14: 100%|██████████| 44506/44506 [00:02<00:00, 18910.62it/s]\n",
      "Epoch 15: 100%|██████████| 44506/44506 [00:02<00:00, 20810.41it/s]\n",
      "Epoch 16: 100%|██████████| 44506/44506 [00:02<00:00, 21361.88it/s]\n",
      "Epoch 17: 100%|██████████| 44506/44506 [00:02<00:00, 18440.51it/s]\n",
      "Epoch 18: 100%|██████████| 44506/44506 [00:02<00:00, 21206.01it/s]\n",
      "Epoch 19: 100%|██████████| 44506/44506 [00:02<00:00, 20086.00it/s]\n",
      "Epoch 20: 100%|██████████| 44506/44506 [00:02<00:00, 20943.08it/s]\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.optim as optim\n",
    "from torch.utils.data import DataLoader, TensorDataset\n",
    "from gensim.models.doc2vec import Doc2Vec, TaggedDocument\n",
    "from nltk.tokenize import word_tokenize\n",
    "from sklearn.preprocessing import MultiLabelBinarizer\n",
    "from sklearn.model_selection import train_test_split\n",
    "from tqdm import tqdm\n",
    "\n",
    "# Read data from CSV file\n",
    "movie_data = pd.read_csv(\n",
    "    \"/Users/vipul/Nova/Projects/genre_spectrum/movies_metadata.csv\", low_memory=False\n",
    ")\n",
    "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "\n",
    "\n",
    "def preprocess_data(movie_data_chunk):\n",
    "    tagged_docs = []\n",
    "    valid_indices = []\n",
    "    movie_info = []\n",
    "\n",
    "    # Wrap your loop with tqdm\n",
    "    for i, row in tqdm(movie_data_chunk.iterrows(), total=len(movie_data_chunk)):\n",
    "        try:\n",
    "            # Constructing movie text\n",
    "            movies_text = \"\"\n",
    "            movies_text += \"Overview: \" + row[\"overview\"] + \"\\n\"\n",
    "            genres = \", \".join([genre[\"name\"] for genre in eval(row[\"genres\"])])\n",
    "            movies_text += \"Overview: \" + row[\"overview\"] + \"\\n\"\n",
    "            movies_text += \"Genres: \" + genres + \"\\n\"\n",
    "            movies_text += \"Title: \" + row[\"title\"] + \"\\n\"\n",
    "            tagged_docs.append(\n",
    "                TaggedDocument(words=word_tokenize(movies_text.lower()), tags=[str(i)])\n",
    "            )\n",
    "            valid_indices.append(i)\n",
    "            movie_info.append((row[\"title\"], genres))\n",
    "        except Exception as e:\n",
    "            continue\n",
    "\n",
    "    return tagged_docs, valid_indices, movie_info\n",
    "\n",
    "\n",
    "def train_doc2vec_model(tagged_data, num_epochs=20):\n",
    "    # Initialize Doc2Vec model\n",
    "    doc2vec_model = Doc2Vec(vector_size=100, min_count=2, epochs=num_epochs)\n",
    "    doc2vec_model.build_vocab(tqdm(tagged_data, desc=\"Building Vocabulary\"))\n",
    "    for epoch in range(num_epochs):\n",
    "        doc2vec_model.train(\n",
    "            tqdm(tagged_data, desc=f\"Epoch {epoch+1}\"),\n",
    "            total_examples=doc2vec_model.corpus_count,\n",
    "            epochs=doc2vec_model.epochs,\n",
    "        )\n",
    "\n",
    "    return doc2vec_model\n",
    "\n",
    "\n",
    "# Preprocess data and extract genres for the first 1000 movies\n",
    "chunk_size = 1000\n",
    "tagged_data = []\n",
    "valid_indices = []\n",
    "movie_info = []\n",
    "for chunk_start in range(0, len(movie_data), chunk_size):\n",
    "    movie_data_chunk = movie_data.iloc[chunk_start : chunk_start + chunk_size]\n",
    "    chunk_tagged_data, chunk_valid_indices, chunk_movie_info = preprocess_data(\n",
    "        movie_data_chunk\n",
    "    )\n",
    "    tagged_data.extend(chunk_tagged_data)\n",
    "    valid_indices.extend(chunk_valid_indices)\n",
    "    movie_info.extend(chunk_movie_info)\n",
    "\n",
    "doc2vec_model = train_doc2vec_model(tagged_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "VryHT1zVuEp0"
   },
   "source": [
    "### Training a Neural Network for the Genre Classification Task"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "3pVNy2UKt5lu"
   },
   "outputs": [],
   "source": [
    "# Extract genre labels for the valid indices\n",
    "genres_list = []\n",
    "for i in valid_indices:\n",
    "    row = movie_data.loc[i]\n",
    "    genres = [genre[\"name\"] for genre in eval(row[\"genres\"])]\n",
    "    genres_list.append(genres)\n",
    "\n",
    "mlb = MultiLabelBinarizer()\n",
    "genre_labels = mlb.fit_transform(genres_list)\n",
    "\n",
    "embeddings = []\n",
    "for i in valid_indices:\n",
    "    embeddings.append(doc2vec_model.dv[str(i)])\n",
    "X_train, X_test, y_train, y_test = train_test_split(\n",
    "    embeddings, genre_labels, test_size=0.2, random_state=42\n",
    ")\n",
    "\n",
    "X_train_np = np.array(X_train, dtype=np.float32)\n",
    "y_train_np = np.array(y_train, dtype=np.float32)\n",
    "X_test_np = np.array(X_test, dtype=np.float32)\n",
    "y_test_np = np.array(y_test, dtype=np.float32)\n",
    "\n",
    "X_train_tensor = torch.tensor(X_train_np)\n",
    "y_train_tensor = torch.tensor(y_train_np)\n",
    "X_test_tensor = torch.tensor(X_test_np)\n",
    "y_test_tensor = torch.tensor(y_test_np)\n",
    "\n",
    "\n",
    "class GenreClassifier(nn.Module):\n",
    "    def __init__(self, input_size, output_size):\n",
    "        super(GenreClassifier, self).__init__()\n",
    "        self.fc1 = nn.Linear(input_size, 512)\n",
    "        self.bn1 = nn.BatchNorm1d(512)\n",
    "        self.fc2 = nn.Linear(512, 256)\n",
    "        self.bn2 = nn.BatchNorm1d(256)\n",
    "        self.fc3 = nn.Linear(256, 128)\n",
    "        self.bn3 = nn.BatchNorm1d(128)\n",
    "        self.fc4 = nn.Linear(128, output_size)\n",
    "        self.relu = nn.ReLU()\n",
    "        self.dropout = nn.Dropout(p=0.2)  # Adjust the dropout rate as needed\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = self.fc1(x)\n",
    "        x = self.bn1(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.dropout(x)\n",
    "        x = self.fc2(x)\n",
    "        x = self.bn2(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.dropout(x)\n",
    "        x = self.fc3(x)\n",
    "        x = self.bn3(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.dropout(x)\n",
    "        x = self.fc4(x)\n",
    "        return x\n",
    "\n",
    "\n",
    "# Move model to the selected device\n",
    "model = GenreClassifier(input_size=100, output_size=len(mlb.classes_)).to(device)\n",
    "\n",
    "# Define loss function and optimizer\n",
    "criterion = nn.BCEWithLogitsLoss()\n",
    "optimizer = optim.Adam(model.parameters(), lr=0.001)\n",
    "\n",
    "# Training loop\n",
    "epochs = 50\n",
    "batch_size = 64\n",
    "\n",
    "train_dataset = TensorDataset(X_train_tensor.to(device), y_train_tensor.to(device))\n",
    "train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)\n",
    "\n",
    "for epoch in range(epochs):\n",
    "    model.train()\n",
    "    running_loss = 0.0\n",
    "    for inputs, labels in train_loader:\n",
    "        inputs, labels = inputs.to(device), labels.to(device)  # Move data to device\n",
    "        optimizer.zero_grad()\n",
    "        outputs = model(inputs)\n",
    "        loss = criterion(outputs, labels)\n",
    "        loss.backward()\n",
    "        optimizer.step()\n",
    "        running_loss += loss.item() * inputs.size(0)\n",
    "    epoch_loss = running_loss / len(train_loader.dataset)\n",
    "    print(f\"Epoch [{epoch + 1}/{epochs}], Loss: {epoch_loss:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "yV8lTDYIubEQ"
   },
   "source": [
    "### Testing the `model` to see if our model is able to predict the genres for the movies from the test dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "73D3aqdJuct8"
   },
   "outputs": [],
   "source": [
    "from sklearn.metrics import f1_score\n",
    "\n",
    "model.eval()\n",
    "with torch.no_grad():\n",
    "    X_test_tensor, y_test_tensor = X_test_tensor.to(device), y_test_tensor.to(\n",
    "        device\n",
    "    )  # Move test data to device\n",
    "    outputs = model(X_test_tensor)\n",
    "    test_loss = criterion(outputs, y_test_tensor)\n",
    "    print(f\"Test Loss: {test_loss.item():.4f}\")\n",
    "\n",
    "\n",
    "thresholds = [0.1] * len(mlb.classes_)\n",
    "thresholds_tensor = torch.tensor(thresholds, device=device).unsqueeze(0)\n",
    "\n",
    "# Convert the outputs to binary predictions using varying thresholds\n",
    "predicted_labels = (outputs > thresholds_tensor).cpu().numpy()\n",
    "\n",
    "# Convert binary predictions and actual labels to multi-label format\n",
    "predicted_multilabels = mlb.inverse_transform(predicted_labels)\n",
    "actual_multilabels = mlb.inverse_transform(y_test_np)\n",
    "\n",
    "# Print the Predicted and Actual Labels for each movie\n",
    "for i, (predicted, actual) in enumerate(zip(predicted_multilabels, actual_multilabels)):\n",
    "    print(f\"Movie {i+1}:\")\n",
    "    print(f\"    Predicted Labels: {predicted}\")\n",
    "    print(f\"    Actual Labels: {actual}\")\n",
    "\n",
    "\n",
    "# Compute F1-score\n",
    "f1 = f1_score(y_test_np, predicted_labels, average=\"micro\")\n",
    "print(f\"F1-score: {f1:.4f}\")\n",
    "\n",
    "# Saving the trained model\n",
    "torch.save(model.state_dict(), \"trained_model.pth\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kZrHpMm4un0G"
   },
   "source": [
    "### Storing the Doc2Vec Embeddings into LanceDB VectorDatabase"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "BTTNb9irrY4h"
   },
   "outputs": [],
   "source": [
    "import lancedb\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "data = []\n",
    "\n",
    "for i in valid_indices:\n",
    "    embedding = doc2vec_model.dv[str(i)]\n",
    "    title, genres = movie_info[valid_indices.index(i)]\n",
    "    data.append({\"title\": title, \"genres\": genres, \"vector\": embedding.tolist()})\n",
    "\n",
    "db = lancedb.connect(\".db\")\n",
    "tbl = db.create_table(\"doc2vec_embeddings\", data, mode=\"Overwrite\")\n",
    "db[\"doc2vec_embeddings\"].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ciUFn7uQrY4i"
   },
   "outputs": [],
   "source": [
    "def get_recommendations(title):\n",
    "    pd_data = pd.DataFrame(data)\n",
    "    result = (\n",
    "        tbl.search(pd_data[pd_data[\"title\"] == title][\"vector\"].values[0])\n",
    "        .metric(\"cosine\")\n",
    "        .limit(10)\n",
    "        .to_pandas()\n",
    "    )\n",
    "    return result[[\"title\"]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8Kz-JGsTuwmk"
   },
   "source": [
    "### D-Day : Let's generate some recommendations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "uw_El12JrY4j",
    "outputId": "c245bab5-7966-4fd1-ec72-37f708c3b570"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Vertical Limit</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Demons of War</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Fear and Desire</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Escape from Sobibor</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Last Girl Standing</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>K2: Siren of the Himalayas</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Ghost Ship</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Camp Massacre</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Captain Nemo and the Underwater City</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Seas Beneath</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                  title\n",
       "0                        Vertical Limit\n",
       "1                         Demons of War\n",
       "2                       Fear and Desire\n",
       "3                   Escape from Sobibor\n",
       "4                    Last Girl Standing\n",
       "5            K2: Siren of the Himalayas\n",
       "6                            Ghost Ship\n",
       "7                         Camp Massacre\n",
       "8  Captain Nemo and the Underwater City\n",
       "9                          Seas Beneath"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_recommendations(\"Vertical Limit\")"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
